
Hits 1 – 20 of 81

1	MAGIC DUST FOR CROSS-LINGUAL ADAPTATION OF MONOLINGUAL WAV2VEC-2.0
	Khurana, Sameer; Laurent, Antoine; Glass, James
	In: ICASSP 2022 ; https://hal.archives-ouvertes.fr/hal-03544515 ; ICASSP 2022, May 2022, Singapore, Singapore (2022)
	BASE

2	Simple and Effective Unsupervised Speech Synthesis ...
	Liu, Alexander H.; Lai, Cheng-I Jeff; Hsu, Wei-Ning. - : arXiv, 2022
	BASE

3	Learning Audio-Video Language Representations
	Rouditchenko, Andrew. - : Massachusetts Institute of Technology, 2021
	BASE

4	Cascaded Multilingual Audio-Visual Learning from Videos ...
	Rouditchenko, Andrew; Boggust, Angie; Harwath, David. - : arXiv, 2021
	BASE

5	Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0 ...
	Khurana, Sameer; Laurent, Antoine; Glass, James. - : arXiv, 2021
	BASE

6	Text-Free Image-to-Speech Synthesis Using Learned Segmental Units ...
	The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing 2021; Glass, James; Harwath, David; Hsu, Wei-Ning; Miller, Tyler; Song, Christopher. - : Underline Science Inc., 2021
	Abstract: In this paper we present the first model for directly synthesizing fluent, natural-sounding spoken audio captions for images that does not require natural language text as an intermediate representation or source of supervision. Instead, we connect the image captioning module and the speech synthesis module with a set of discrete, sub-word speech units that are discovered with a self-supervised visual grounding task. We conduct experiments on the Flickr8k spoken caption dataset in addition to a novel corpus of spoken audio captions collected for the popular MSCOCO dataset, demonstrating that our generated captions also capture diverse visual semantics of the images they describe. We investigate several different intermediate speech representations, and empirically find that the representation must satisfy several important properties to serve as drop-in replacements for text. ... (Read paper: https://www.aclanthology.org/2021.acl-long.411)
	Keyword: Computational Linguistics; Condensed Matter Physics; Deep Learning; Electromagnetism; FOS Physical sciences; Information and Knowledge Engineering; Neural Network; Semantics
	URL: https://underline.io/lecture/25832-text-free-image-to-speech-synthesis-using-learned-segmental-units ; https://dx.doi.org/10.48448/r06d-y818
	BASE

7	Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation? ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Glass, James; He, Tianxing. - : Underline Science Inc., 2021
	BASE

8	Mitigating Biases in Toxic Language Detection through Invariant Rationalization ...
	Chuang, Yung-Sung; Gao, Mingye; Luo, Hongyin. - : arXiv, 2021
	BASE

9	Mitigating Biases in Toxic Language Detection through Invariant Rationalization ...
	The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing 2021; Chen, Yun-Nung; Chuang, Yung-Sung. - : Underline Science Inc., 2021
	BASE

10	A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
	Khurana, Sameer; Laurent, Antoine; Hsu, Wei-Ning...
	In: Interspeech 2020 ; https://hal.archives-ouvertes.fr/hal-02912029 ; Interspeech 2020, Oct 2020, Shanghai, China (2020)
	BASE

11	Similarity Analysis of Contextual Word Representation Models ...
	Wu, John M.; Belinkov, Yonatan; Sajjad, Hassan. - : arXiv, 2020
	BASE

12	CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning ...
	Khurana, Sameer; Laurent, Antoine; Glass, James. - : arXiv, 2020
	BASE

13	A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning ...
	Khurana, Sameer; Laurent, Antoine; Hsu, Wei-Ning. - : arXiv, 2020
	BASE

14	What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context ...
	Baly, Ramy; Karadzhov, Georgi; An, Jisun. - : arXiv, 2020
	BASE

15	Vector-Quantized Autoregressive Predictive Coding ...
	Chung, Yu-An; Tang, Hao; Glass, James. - : arXiv, 2020
	BASE

16	Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies ...
	Liu, Alexander H.; Chung, Yu-An; Glass, James. - : arXiv, 2020
	BASE

17	Improved Speech Representations with Multi-Target Autoregressive Predictive Coding ...
	Chung, Yu-An; Glass, James. - : arXiv, 2020
	BASE

18	Classifying Alzheimer's Disease Using Audio and Text-Based Representations of Speech
	Haulcy, R'mani (R'mani Symon); Glass, James R
	In: Frontiers (2020)
	BASE

19	Identification of digital voice biomarkers for cognitive health
	Lin, Honghuang; Karjadi, Cody; Ang, Ting F. A....
	In: Explor Med (2020)
	BASE

20	On the Linguistic Representational Power of Neural Machine Translation Models
	Belinkov, Yonatan; Durrani, Nadir; Dalvi, Fahim...
	In: Computational Linguistics, Vol 46, Iss 1, Pp 1-52 (2020)
	BASE
